A unitary interaction coupling two parties enables quantum or classical communication in both
the forward and backward directions. Each communication capacity can be thought of as a tradeoff 
between the achievable rates of specific types of forward and backward communication. Our first 
result shows that for any bipartite unitary gate, bidirectional coherent classical communication is 
no more difficult than bidirectional classical communication — they have the same achievable rate 
regions. Previously this result was known only for the unidirectional capacities (i.e., the boundaries 
of the tradeoff). We then relate the tradeoff for two-way coherent communication to the tradeoff 
for two-way quantum communication and the tradeoff for coherent communication in one direction
and quantum communication in the other.
I. INTRODUCTION



Quantum communication theory typically studies channels which take an input quantum system from one party (call 
her Alice), act on it possibly with some noise (a trace preserving completely positive map [1]) and pass the system on to
another party (call him Bob). A quantum channel can generate quantum or classical communication or entanglement
at some rate. The maximum rate at which each task can be done with arbitrary precision and with an asymptotically 
large number of channel uses is called the capacity. 

A bipartite unitary gate coupling Alice and Bob can achieve similar tasks, with either party (or both) in the role of 
sender or receiver. Early studies can be found in [2-5], focusing on more specific systems and protocols. For example, 
a CNOT can send a classical bit from Alice to Bob, or from Bob to Alice or generate one EPR pair. Asymptotic 
capacities of a general bipartite unitary evolution to communicate and to generate entanglement were formalized in 
Ref. [6]. A general expression for the entanglement capacity was found in Refs. [6, 7], and that for the entanglement-assisted
one-way classical capacity was found in Ref. [6]. Expressions for various one-way quantum capacities were
subsequently found in Ref. [8], by introducing the concepts of coherent classical communication and entanglement
recycling. (Their precise definitions, as well as concepts throughout the rest of this paragraph, will be clarified in
Sec. II.) In particular, Ref. [8] showed that for any gate, the one-way classical capacity is equal to its one-way coherent
capacity. This further provides an expression for the one-way classical capacity assisted by any linear amount of free
entanglement, and allows the one-way quantum capacity and the remote state preparation capacity to be expressed in
terms of this one-way classical capacity.
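The three uses of a CNOT mentioned above can be checked with a few lines of linear algebra. The following is a minimal sketch, not taken from the paper: it assumes the basis ordering |A B> with Alice's qubit as the control, and the helper names `kron`, `cnot`, and `close` are illustrative.

```python
import math

S = 1 / math.sqrt(2)
ZERO, ONE = [1.0, 0.0], [0.0, 1.0]
PLUS, MINUS = [S, S], [S, -S]

def kron(u, v):
    """Tensor product of two state vectors (first factor is Alice's qubit)."""
    return [a * b for a in u for b in v]

def cnot(state):
    """CNOT with Alice as control; basis index is 2*A + B, so |10> and |11> swap."""
    return [state[0], state[1], state[3], state[2]]

def close(u, v, tol=1e-12):
    """Component-wise comparison of two state vectors."""
    return all(abs(a - b) < tol for a, b in zip(u, v))

# Forward bit: Alice's computational-basis input is copied onto Bob's |0>.
forward_ok = all(close(cnot(kron(k, ZERO)), kron(k, k)) for k in (ZERO, ONE))

# Backward bit: Bob's choice of |+> or |-> is kicked back onto Alice's |+>.
backward_ok = all(close(cnot(kron(PLUS, k)), kron(k, k)) for k in (PLUS, MINUS))

# One EPR pair: CNOT acting on |+>|0> gives (|00> + |11>)/sqrt(2).
epr_ok = close(cnot(kron(PLUS, ZERO)), [S, 0.0, 0.0, S])
```

In each case a single use of the gate suffices; which task is performed depends only on the local input states and on the basis in which the receiver reads the output.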

However, the core result for bipartite unitary evolution in Ref. [8], the equality of the one-way classical capacity 
and the coherent capacity, is left open for simultaneous two-way communication. Our main result is a proof of 
this equality in Sec. III. For completeness, we also compare two-way classical communication and coherent classical 
communication in the regime of negative communication rates (i.e., consuming communication to help produce other 
resources). Following similar arguments as in Ref. [8], we list some corollaries. These are the two-way remote state 
preparation capacity and quantum capacity in terms of the classical capacity. Our main result is proved by using a 
coherent version of a one-time pad (analogous to that in Ref. [9]). The reason why a more direct extension of the 
proof from Ref. [8] fails is given in an appendix. A second appendix discusses the implications our results have on the 
definition of coherent classical communication. 



II. FRAMEWORK, DEFINITIONS, AND NOTATIONS 



Throughout the paper, we consider communication between two parties, Alice and Bob. Systems in their possession
are denoted by respective subscripts $A, A_0, A_1, \ldots$ and $B, B_0, B_1, \ldots$. System labels are omitted when they are clear from
the context. We also use superscripts (A) and (B) for different (but analogous) objects related to Alice and Bob (for
example, their respective local operations). Exp and log are always base 2. We will primarily use the trace distance
$\frac{1}{2}\|\rho-\sigma\|_1$ to quantify the proximity of any two states $\rho$ and $\sigma$, where $\|X\|_1 := \mathrm{Tr}\sqrt{X^\dagger X}$. For two pure states $|\alpha\rangle, |\beta\rangle$,
$\frac{1}{2}\|\,|\alpha\rangle\langle\alpha| - |\beta\rangle\langle\beta|\,\|_1 = \epsilon \Leftrightarrow |\langle\beta|\alpha\rangle|^2 = 1-\epsilon^2$. We use $|\alpha\rangle \approx_\epsilon |\beta\rangle$ as a shorthand for $\frac{1}{2}\|\,|\alpha\rangle\langle\alpha| - |\beta\rangle\langle\beta|\,\|_1 \le \epsilon$.
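The relation between trace distance and overlap for pure states can be verified numerically. This is a small sketch restricted to real two-dimensional unit vectors (an assumption made only to keep the eigenvalue computation elementary); the function name is illustrative.

```python
import math

def pure_state_trace_distance(alpha, beta):
    """(1/2) * trace norm of |alpha><alpha| - |beta><beta| for real 2-dim unit vectors."""
    m = [[alpha[i] * alpha[j] - beta[i] * beta[j] for j in range(2)] for i in range(2)]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    # The difference of two projectors is traceless, so its eigenvalues are
    # +lam and -lam with det = -lam**2; half the trace norm is therefore lam.
    return math.sqrt(max(0.0, -det))

def overlap_squared(alpha, beta):
    """|<beta|alpha>|^2 for real vectors."""
    return sum(a * b for a, b in zip(alpha, beta)) ** 2

theta = 0.3
alpha = [1.0, 0.0]
beta = [math.cos(theta), math.sin(theta)]
eps = pure_state_trace_distance(alpha, beta)
```

With these inputs `eps` equals `sin(theta)` up to rounding, and `overlap_squared(alpha, beta)` equals `1 - eps**2`, as the identity in the text requires.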
We now review some definitions and background results, mostly from Refs. [6, 8, 10]. Let $\{|x\rangle\}_{x=0,1}$ be a basis for
$\mathbb{C}^2$. We first define various resources. Let an ebit denote a unit of shared quantum correlation, as quantified by an
EPR pair $|\Phi\rangle_{AB} = \frac{1}{\sqrt{2}}\sum_{x=0}^{1}|x\rangle_A|x\rangle_B$. Throughout the paper, we omit the tensor product symbol, $\otimes$, if no confusion
may arise. Following Ref. [8], we denote the ability to communicate a qubit in the forward direction (from Alice to
Bob) as qubit($\rightarrow$), and mathematically, it corresponds to the isometry $|x\rangle_A \rightarrow |x\rangle_B$. Qubit communication in the
opposite direction, the isometry $|x\rangle_B \rightarrow |x\rangle_A$, is denoted qubit($\leftarrow$). Nonunitary evolution can be viewed as a unitary
evolution between all participating parties, together with an inaccessible one called the environment, denoted by E.
Then, the ability to communicate a classical bit in the forward direction, denoted as cbit($\rightarrow$), is given by the linear
map $|x\rangle_A \rightarrow |x\rangle_B|x\rangle_E$. In contrast, a cobit($\rightarrow$) is given by the map $|x\rangle_A \rightarrow |x\rangle_A|x\rangle_B$. A cbit($\leftarrow$) and a cobit($\leftarrow$) are
defined similarly. We call cobits coherent classical communication, and cbits incoherent classical communication or
simply classical communication. One can view cobits as cbits in which Alice is given the environment E as quantum
feedback. The results of this paper imply that cobits may be equivalently defined as the ability to send cbits through
unitary means. In Appendix B we will make this idea precise.
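The difference between a cbit and a cobit is only in who holds the second copy of x, which a toy calculation makes concrete. The sketch below is illustrative and not from the paper: states are dictionaries from basis labels to amplitudes, and the function names are assumptions.

```python
import math

S = 1 / math.sqrt(2)

def cobit(state):
    """|x>_A -> |x>_A |x>_B. Keys of the result are (A, B)."""
    return {(x, x): amp for x, amp in state.items()}

def cbit(state):
    """|x>_A -> |x>_B |x>_E. Keys of the result are (B, E)."""
    return {(x, x): amp for x, amp in state.items()}

def reduced_first(joint):
    """Density matrix of the first subsystem after tracing out the second."""
    rho = {}
    for (p, q), amp1 in joint.items():
        for (r, s), amp2 in joint.items():
            if q == s:
                rho[(p, r)] = rho.get((p, r), 0.0) + amp1 * amp2.conjugate()
    return rho

plus = {0: S, 1: S}

# cobit on |+>: Alice and Bob share (|00> + |11>)/sqrt(2), i.e. one ebit.
shared = cobit(plus)

# cbit on |+>: the second register is lost to E, so Bob holds a mixed state
# with no off-diagonal (coherence) terms.
bob_rho = reduced_first(cbit(plus))
```

The two maps are the same isometry; the cobit is the version in which Alice keeps the register that a cbit hands to the environment, which is why `shared` is a usable EPR pair while `bob_rho` is merely a uniformly random bit.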

Communication theory is primarily concerned with converting available resources into desired ones. Roughly speaking,
given two communication resources X and Y, we say that $X \ge rY$ if X can be transformed into Y asymptotically
and approximately at rate r, i.e., $\forall \delta>0$, $\exists N$ such that $\forall n \ge N$, n copies (or uses) of X can be transformed into
$\ge n(r-\delta)$ copies (or uses) of Y, in an approximate manner to be defined. For example, Shannon's noisy coding
theorem [11] for a classical channel (i.e. a stochastic map) T could be stated as $T \ge C(T)$ cbits, where $C(T) :=
\max_{P(S)} [H(S)+H(T(S))-H(S,T(S))]$ is the classical capacity of the channel T, $H(\cdot)$ is the entropy of a random
variable, and the maximization is over all distributions P(S) of the input alphabet S. If $X \ge Y$ and $Y \ge X$, then we
write that X = Y. For example, the reverse Shannon theorem [12] states that $C(T)$ cbits $\ge T$, so that $\frac{T_1}{C(T_1)} = \frac{T_2}{C(T_2)}$
for any two classical channels $T_1, T_2$ (in the presence of unlimited shared randomness). Another result [8] of this type,
2 cobits($\rightarrow$) = 1 ebit + 1 qubit($\rightarrow$), will be used in Sec. IV to relate the classical and quantum capacities of unitary
gates.
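As a concrete instance of the capacity formula, the maximization over input distributions can be carried out by brute force for a binary-input channel. This is a sketch only: the grid search, the function names, and the choice of a binary symmetric channel are illustrative assumptions, not part of the paper.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(p_in, channel):
    """H(S) + H(T(S)) - H(S, T(S)) for channel[s][t] = Pr(output t | input s)."""
    n_in, n_out = len(p_in), len(channel[0])
    joint = [p_in[s] * channel[s][t] for s in range(n_in) for t in range(n_out)]
    p_out = [sum(p_in[s] * channel[s][t] for s in range(n_in)) for t in range(n_out)]
    return entropy(p_in) + entropy(p_out) - entropy(joint)

def binary_input_capacity(channel, steps=1000):
    """Grid search over input distributions (q, 1 - q); adequate for a sketch."""
    return max(
        mutual_information([q / steps, 1 - q / steps], channel)
        for q in range(steps + 1)
    )

flip = 0.1
bsc = [[1 - flip, flip], [flip, 1 - flip]]
capacity = binary_input_capacity(bsc)
```

For the binary symmetric channel the maximum is attained at the uniform input, which lies on the grid, so `capacity` agrees with the closed form `1 - H_2(flip)`.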

The definition for $X \ge rY$ is only complete given an error definition, and a good one should ensure transitivity
of resource inequalities: $X \ge rY$ and $Y \ge sZ$ implies $X \ge rsZ$. Operationally, the two corresponding resource
transformations should be sufficiently accurate to be composable. Mathematically, we say that $X \ge rY$ if there
exist vanishing sequences of nonnegative numbers, $\{\epsilon_n\}$, $\{\delta_n\}$, and protocols $\mathcal{P}_n$ each using X at most n times (and
other allowed resources), such that $\mathcal{P}_n \approx_{\epsilon_n} Y^{\otimes (r-\delta_n)n}$. Here the notion of approximation $\approx$ is extended from states to
operations as

$\forall |\psi\rangle \quad \frac{1}{2}\left\| (I\otimes\mathcal{P}_n)(|\psi\rangle) - (I\otimes Y^{\otimes (r-\delta_n)n})(|\psi\rangle) \right\|_1 \le \epsilon_n$ , (1)

where I denotes the identity operation on a reference system of dimension given by the input to $\mathcal{P}_n$. Including a
reference system in Eq. (1) ensures that $\mathcal{P}_n$ and $Y^{\otimes (r-\delta_n)n}$ transform correlations similarly. Here, we use the symbol Y
to denote the associated state transformation enabled by the resource (see Sec. I for examples). We will see examples
of what the above means in the next section.

We can now define the achievable classical rate region of a unitary gate U as the set of points $(C_1, C_2, E)$ such that
$U \ge C_1$ cbits($\rightarrow$) + $C_2$ cbits($\leftarrow$) + $E$ ebits. When $C_1$, $C_2$, or $E$ is negative, it means that the resource is being consumed;
for example, if $E < 0$ and $C_1, C_2 \ge 0$, then $U + (-E)$ ebits $\ge C_1$ cbits($\rightarrow$) + $C_2$ cbits($\leftarrow$) represents entanglement-assisted
communication. This paper is mostly concerned with $C_1, C_2 \ge 0$ and arbitrary $E$. Part of the $(C_1, C_2, E)$
achievable region has been characterized, for the special cases of $C_1, C_2 \le 0$ (entanglement capacity [6, 7] which is
not increased by free classical communication), $C_2 = 0$, $E = -\infty$ (one-way classical communication with unlimited
entanglement assistance [6], though the actual protocol requires only finite entanglement assistance) and $C_2 = 0$
(one-way classical communication with arbitrary entanglement assistance [8]). We can define the achievable coherent
classical rate region of U analogously as the triples $(C_1, C_2, E)$ so that $U \ge C_1$ cobits($\rightarrow$) + $C_2$ cobits($\leftarrow$) + $E$ ebits.

Reference [8] showed that $U \ge C$ cbits($\rightarrow$) + $E$ ebits if and only if $U \ge C$ cobits($\rightarrow$) + $E$ ebits, i.e., the coherent
and incoherent classical rate regions coincide on the planes $C_1 = 0$ and $C_2 = 0$. In the next section we prove that
the coherent and incoherent rate regions are identical in the entire $C_1, C_2 \ge 0$ quadrant. Other quadrants will be
considered for completeness; this amounts to understanding how to best use back classical communication. We will
see that assistance by cobits only generates entanglement and that cbits are useless. We then apply the result to
relate the capacity regions of different types of forward and backward communication.
III. BIDIRECTIONAL COHERENT CLASSICAL COMMUNICATION

Theorem 1. For any bipartite unitary or isometry U and $C_1, C_2 \ge 0$,

$U \ge C_1$ cbits($\rightarrow$) + $C_2$ cbits($\leftarrow$) + $E$ ebits iff (2)
$U \ge C_1$ cobits($\rightarrow$) + $C_2$ cobits($\leftarrow$) + $E$ ebits (3)

Proof: Since 1 cobit $\ge$ 1 cbit, it suffices to prove the forward implication. In other words, given the existence
of protocols achieving the resource transformation in Eq. (2), we will construct protocols that achieve the resource
transformation in Eq. (3). We delay the discussion for $E \ne 0$ until the end of this section. For now, suppose $E = 0$.

• The definition of $\mathcal{P}_n$

Formally, Eq. (2) indicates the existence of sequences of nonnegative real numbers $\{\epsilon_n\}$, $\{\delta_n\}$ satisfying $\epsilon_n, \delta_n \rightarrow 0$ as
$n \rightarrow \infty$; a sequence of protocols $\mathcal{P}_n = (V_n \otimes W_n)\, U \cdots U\, (V_1 \otimes W_1)\, U\, (V_0 \otimes W_0)$, where $V_j, W_j$ are local isometries that
may also act on extra local ancilla systems; and sequences of integers $C_1^{(n)}, C_2^{(n)}$ satisfying $nC_1 \ge C_1^{(n)} \ge n(C_1-\delta_n)$,
$nC_2 \ge C_2^{(n)} \ge n(C_2-\delta_n)$, such that the following success criterion holds.

Let $a \in \{0,1\}^{C_1^{(n)}}$ and $b \in \{0,1\}^{C_2^{(n)}}$ be the respective messages of Alice and Bob. Let $|\varphi_{ab}\rangle := \mathcal{P}_n(|a\rangle_{A_1}|b\rangle_{B_1})$. Note
that $|\varphi_{ab}\rangle$ generally occupies a space of larger dimension than $A_1 \otimes B_1$ since $\mathcal{P}_n$ may add local ancillas. To say that
$\mathcal{P}_n$ can transmit classical messages, we require that local measurements on $|\varphi_{ab}\rangle$ can generate messages $b'$ for Alice
and $a'$ for Bob according to a distribution $\Pr(a'b'|ab)$ such that

$\forall a,b \quad \frac{1}{2}\sum_{a',b'} \left| \Pr(a'b'|ab) - \delta_{a,a'}\delta_{b,b'} \right| \le \epsilon_n$ (4)

where $a', b'$ are summed over $\{0,1\}^{C_1^{(n)}}$ and $\{0,1\}^{C_2^{(n)}}$ respectively. Eq. (4) follows from applying Eq. (1) to classical
communication, taking the final state to be the distribution of the output classical messages. Since any measurement
can be implemented as a joint unitary on the system and an added ancilla, up to a redefinition of $V_n, W_n$, we can
assume

$|\varphi_{ab}\rangle := \mathcal{P}_n(|a\rangle_{A_1}|b\rangle_{B_1}) = \sum_{a',b'} |b'\rangle_{A_1}|a'\rangle_{B_1}|\gamma^{ab}_{a'b'}\rangle_{A_2B_2}$ (5)

where the dimensions of $A_1$ and $B_1$ are interchanged by $\mathcal{P}_n$, and $|\gamma^{ab}_{a'b'}\rangle$ are subnormalized states with $\Pr(a'b'|ab) :=
\langle\gamma^{ab}_{a'b'}|\gamma^{ab}_{a'b'}\rangle$ satisfying Eq. (4). Thus, for each $a, b$ most of the weight of $|\varphi_{ab}\rangle$ is contained in the $|\gamma^{ab}_{ab}\rangle$ term,
corresponding to error-free transmission of the messages. See Fig. 1(a).

• The three main ideas for turning classical communication into coherent classical communication

We first give an informal overview of the construction and the intuition behind it. For simplicity, consider the
error-free term with $|\gamma^{ab}_{ab}\rangle_{A_2B_2}$. To see why classical communication via unitary means should be equivalent to coherent
classical communication, consider the special case when $|\gamma^{ab}_{ab}\rangle_{A_2B_2}$ is independent of $a, b$. In this case, copying $a, b$
to local ancilla systems $A_0, B_0$ before $\mathcal{P}_n$ and discarding $A_2B_2$ after $\mathcal{P}_n$ leaves a state $\approx |b\rangle_{A_1}|a\rangle_{A_0}|a\rangle_{B_1}|b\rangle_{B_0}$, the
desired coherent classical communication. See Fig. 1(b). In general $|\gamma^{ab}_{ab}\rangle_{A_2B_2}$ will carry information about $a, b$,
so tracing $A_2B_2$ will break the coherence of the classical communication. Moreover, if the Schmidt coefficients of
$|\gamma^{ab}_{ab}\rangle_{A_2B_2}$ depend on $a, b$, then knowing $a, b$ is not sufficient to coherently eliminate $|\gamma^{ab}_{ab}\rangle_{A_2B_2}$ without some additional
communication. The remainder of our proof is built around the need to coherently eliminate this ancilla.

Our first strategy is to encrypt the classical messages $a, b$ by a shared key, in a manner that preserves coherence (similar
to that in Ref. [9]). The coherent version of a shared key is a maximally entangled state. Thus Alice and Bob (1)
again copy their messages to $A_0, B_0$, then (2) encrypt, (3) apply $\mathcal{P}_n$, and (4) decrypt. Encrypting the message makes
it possible to (5) almost decouple the message from the combined "key-and-ancilla" system, which is approximately in
a state $|\Gamma_{00}\rangle$ independent of $a, b$ (exact definitions will follow later). (6) Tracing out $|\Gamma_{00}\rangle$ gives the desired coherent
communication. Let $\mathcal{P}'_n$ denote steps (1)-(5) (see Fig. 1(c)).

If entanglement were free, then our proof of Theorem 1 would be finished. However, we have borrowed $C_1^{(n)}+C_2^{(n)}$
ebits as the encryption key and replaced it with $|\Gamma_{00}\rangle$. Though the entropy of entanglement has not decreased (by
any significant amount), $|\Gamma_{00}\rangle$ is not directly usable in subsequent runs of $\mathcal{P}'_n$. To address this problem, we use a
second strategy of running k copies of $\mathcal{P}'_n$ in parallel and performing entanglement concentration of $|\Gamma_{00}\rangle^{\otimes k}$ using
the techniques of [13]. For sufficiently large k, with high probability, we recover most of the starting ebits. The
regenerated ebits can be used for more iterations of $(\mathcal{P}'_n)^{\otimes k}$ to offset the cost of making the initial $k(C_1^{(n)}+C_2^{(n)})$ ebits,
without the need of borrowing from anywhere.

FIG. 1: Schematic diagrams for $\mathcal{P}_n$ and $\mathcal{P}'_n$. (a) A given protocol $\mathcal{P}_n$ for two-way classical communication. The output is a
superposition (over all $a', b'$) of the depicted states, with most of the weight in the $(a', b') = (a, b)$ term. The unlabeled output
systems in the state $|\gamma^{ab}_{a'b'}\rangle$ are $A_2, B_2$. (b) The same protocol with the inputs copied to local ancillas $A_0, B_0$ before $\mathcal{P}_n$. If
$|\gamma^{ab}_{ab}\rangle$ is independent of $a, b$, two-way coherent classical communication is achieved. (c) The five steps of $\mathcal{P}'_n$. Steps (1)-(4) are shown
in solid lines. Again, the inputs are copied to local ancillas, but $\mathcal{P}_n$ is used on messages encrypted by a coherent one-time-pad
(the input $|a\rangle_{A_1}$ is encrypted by the coherent version of the key $|x\rangle_{A_3}$ and the output $|a' \oplus x\rangle_{B_1}$ is decrypted by $|x\rangle_{B_3}$; similarly,
$|b\rangle_{B_1}$ is encrypted by $|y\rangle_{B_4}$ and $|b' \oplus y\rangle_{A_1}$ decrypted by $|y\rangle_{A_4}$). The intermediate state is shown in the diagram. Step (5), shown
in dotted lines, decouples the messages in $A_{0,1}, B_{0,1}$ from $A_{2,3,4}, B_{2,3,4}$, which is in the joint state very close to $|\Gamma_{00}\rangle$.

However, a technical problem arises with simple repetition of $\mathcal{P}'_n$, which is that errors accumulate. In particular, a
naive application of the triangle inequality gives an error $k\epsilon_n$ but $k, n$ are not independent. In fact, the entanglement
concentration procedure of Ref. [13] requires $k \gg \mathrm{Sch}(|\Gamma_{00}\rangle) = \exp(O(n))$ and we cannot guarantee that $k\epsilon_n \rightarrow 0$ as
$k, n \rightarrow \infty$. Our third strategy is to treat the k uses of $\mathcal{P}'_n$ as k uses of a slightly noisy channel, and encode only $l$
messages (each having $C_1^{(n)}, C_2^{(n)}$ bits in the two directions) using classical error correcting codes. The error rate then
vanishes with a negligible reduction in the communication rate, and without any assumption about how quickly $\epsilon_n$
approaches zero. We will see how related errors in decoupling and entanglement concentration are suppressed.

We now describe the construction and analyze the error in detail. 
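• The definition of $\mathcal{P}'_n$

0. Alice and Bob begin with inputs $|a\rangle_{A_1}|b\rangle_{B_1}$ and the entangled states $|\Phi\rangle^{\otimes C_1^{(n)}}_{A_3B_3}$ and $|\Phi\rangle^{\otimes C_2^{(n)}}_{A_4B_4}$. (Systems 3 and 4
hold the two separate keys for the two messages a and b.) The initial state can then be written as

$\frac{1}{\sqrt{N}} \sum_x |x\rangle_{A_3}|x\rangle_{B_3} \sum_y |y\rangle_{A_4}|y\rangle_{B_4}\, |a\rangle_{A_1}|b\rangle_{B_1}$ (6)

where x and y are summed over $\{0,1\}^{C_1^{(n)}}$ and $\{0,1\}^{C_2^{(n)}}$, and $N = \exp(C_1^{(n)}+C_2^{(n)})$.

1. They coherently copy the messages to $A_0, B_0$.

2. They encrypt the messages using the one-time-pad $|a\rangle_{A_1}|x\rangle_{A_3} \rightarrow |a \oplus x\rangle_{A_1}|x\rangle_{A_3}$ and $|b\rangle_{B_1}|y\rangle_{B_4} \rightarrow |b \oplus y\rangle_{B_1}|y\rangle_{B_4}$
coherently to obtain

$|a\rangle_{A_0}|b\rangle_{B_0}\, \frac{1}{\sqrt{N}} \sum_{xy} |x\rangle_{A_3}|y\rangle_{A_4}|x\rangle_{B_3}|y\rangle_{B_4}\, |a \oplus x\rangle_{A_1}|b \oplus y\rangle_{B_1}$ . (7)

3. Using U n times, they apply $\mathcal{P}_n$ to registers $A_1$ and $B_1$, obtaining an output state

$|a\rangle_{A_0}|b\rangle_{B_0}\, \frac{1}{\sqrt{N}} \sum_{xy} |x\rangle_{A_3}|y\rangle_{A_4}|x\rangle_{B_3}|y\rangle_{B_4} \sum_{a',b'} |b' \oplus y\rangle_{A_1}|a' \oplus x\rangle_{B_1}\, |\gamma^{a\oplus x,\, b\oplus y}_{a'\oplus x,\, b'\oplus y}\rangle_{A_2B_2}$ . (8)

4. Alice decrypts her message in $A_1$ using her key $A_4$ and Bob decrypts $B_1$ using $B_3$ coherently as $|b' \oplus y\rangle_{A_1}|y\rangle_{A_4} \rightarrow
|b'\rangle_{A_1}|y\rangle_{A_4}$ and $|a' \oplus x\rangle_{B_1}|x\rangle_{B_3} \rightarrow |a'\rangle_{B_1}|x\rangle_{B_3}$, producing a state

$|a\rangle_{A_0}|b\rangle_{B_0}\, \frac{1}{\sqrt{N}} \sum_{xy} |x\rangle_{A_3}|y\rangle_{A_4}|x\rangle_{B_3}|y\rangle_{B_4} \sum_{a',b'} |b'\rangle_{A_1}|a'\rangle_{B_1}\, |\gamma^{a\oplus x,\, b\oplus y}_{a'\oplus x,\, b'\oplus y}\rangle_{A_2B_2}$ . (9)
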
5. Further CNOTs $A_1 \rightarrow A_4$, $A_0 \rightarrow A_3$, $B_1 \rightarrow B_3$ and $B_0 \rightarrow B_4$ will leave $A_{2,3,4}$ and $B_{2,3,4}$ almost decoupled from
the classical messages. To see this, the state has become

$|a\rangle_{A_0}|b\rangle_{B_0} \sum_{a',b'} |b'\rangle_{A_1}|a'\rangle_{B_1}\, \frac{1}{\sqrt{N}} \sum_{xy} |a \oplus x\rangle_{A_3}|a' \oplus x\rangle_{B_3}|b' \oplus y\rangle_{A_4}|b \oplus y\rangle_{B_4}\, |\gamma^{a\oplus x,\, b\oplus y}_{a'\oplus x,\, b'\oplus y}\rangle_{A_2B_2}$

$= |a\rangle_{A_0}|b\rangle_{B_0} \sum_{a',b'} |b'\rangle_{A_1}|a'\rangle_{B_1}\, |\Gamma_{a\oplus a',\, b\oplus b'}\rangle_{A_{2,3,4}B_{2,3,4}}$ , (10)

where

$|\Gamma_{a\oplus a',\, b\oplus b'}\rangle_{A_{2,3,4}B_{2,3,4}} := \frac{1}{\sqrt{N}} \sum_{xy} |a \oplus x\rangle_{A_3}|a' \oplus x\rangle_{B_3}|b' \oplus y\rangle_{A_4}|b \oplus y\rangle_{B_4}\, |\gamma^{a\oplus x,\, b\oplus y}_{a'\oplus x,\, b'\oplus y}\rangle_{A_2B_2}$ . (11)

The fact that $|\Gamma_{a\oplus a',\, b\oplus b'}\rangle$ depends only on $a \oplus a'$ and $b \oplus b'$, without any other dependence on a and b, can be easily
seen by replacing $x, y$ with $a \oplus x, b \oplus y$ in $\sum_{xy}$ in the RHS of the above. Note that $\langle\Gamma_{a\oplus a',\, b\oplus b'}|\Gamma_{a\oplus a',\, b\oplus b'}\rangle =
\frac{1}{N}\sum_{xy} \Pr(a' \oplus x, b' \oplus y \,|\, a \oplus x, b \oplus y)$, so in particular for the state corresponding to the error-free term, we have
$\langle\Gamma_{00}|\Gamma_{00}\rangle = \frac{1}{N}\sum_{xy} \Pr(xy|xy) := 1 - \bar\epsilon_n \ge 1 - \epsilon_n$ [14].

Suppose that Alice and Bob could project onto the space where $a' = a$ and $b' = b$, and tell each other they
have succeeded (by using a little extra communication); then the resulting ancilla state $\frac{1}{\sqrt{1-\bar\epsilon_n}}|\Gamma_{00}\rangle$ has at
least $C_1^{(n)} + C_2^{(n)} + \log(1-\epsilon_n)$ ebits, since its largest Schmidt coefficient is $\le [\exp(C_1^{(n)}+C_2^{(n)})(1-\bar\epsilon_n)]^{-1/2}$ and
$\bar\epsilon_n \le \epsilon_n$. (A similar state was studied in Ref. [6] in the proof that the entanglement capacity of a unitary gate
was at least as large as its classical communication capacity.) Furthermore, $|\Gamma_{00}\rangle$ is manifestly independent of
$a, b$. We will see how to improve the probability of successful projection onto the error-free subspace by using
block codes for error correction, and how correct copies of $|\Gamma_{00}\rangle$ can be identified if Alice and Bob can exchange
a small amount of information.
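The claim that the key-and-ancilla state after step 5 depends only on $a \oplus a'$ and $b \oplus b'$ is pure index bookkeeping, so it can be checked mechanically. The sketch below is an illustration under simplifying assumptions: it tracks only the basis labels appearing in the sum over $x, y$ (every term carries the same amplitude prefactor $1/\sqrt{N}$), treats messages as small integers with bitwise XOR, and uses hypothetical function names.

```python
from itertools import product

def gamma_labels(a, a_out, b, b_out, bits_a, bits_b):
    """Labels of the terms in the sum over x, y after the CNOTs of step 5."""
    labels = set()
    for x in range(1 << bits_a):
        for y in range(1 << bits_b):
            labels.add((
                a ^ x,                   # register A_3 after CNOT A_0 -> A_3
                a_out ^ x,               # register B_3 after CNOT B_1 -> B_3
                b_out ^ y,               # register A_4 after CNOT A_1 -> A_4
                b ^ y,                   # register B_4 after CNOT B_0 -> B_4
                (a ^ x, b ^ y),          # superscript of gamma (encrypted inputs)
                (a_out ^ x, b_out ^ y),  # subscript of gamma (encrypted outputs)
            ))
    return frozenset(labels)

def depends_only_on_xor(bits_a, bits_b):
    """True if all (a, a', b, b') with equal XORs give the same set of terms."""
    seen = {}
    range_a, range_b = range(1 << bits_a), range(1 << bits_b)
    for a, a_out, b, b_out in product(range_a, range_a, range_b, range_b):
        key = (a ^ a_out, b ^ b_out)
        labels = gamma_labels(a, a_out, b, b_out, bits_a, bits_b)
        if seen.setdefault(key, labels) != labels:
            return False
    return True
```

Calling `depends_only_on_xor(2, 1)` enumerates every message pair and every output pair for a 2-bit forward and 1-bit backward message, and confirms that the substitution $x \rightarrow a \oplus x$, $y \rightarrow b \oplus y$ removes all dependence on $a, b$ beyond the error pattern.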



• Main idea on how to perform error correction 

As discussed before, $|\Gamma_{00}\rangle$ cannot be used directly as an encryption key: our use of entanglement in $\mathcal{P}'_n$ is not catalytic.
Entanglement concentration of many copies of $|\Gamma_{00}\rangle$ obtained from many runs of $\mathcal{P}'_n$ will make the entanglement
overhead for the one-time-pad negligible, but errors will accumulate. The idea is to suppress the errors in many uses
of $\mathcal{P}'_n$ by error correction. This has to be done with care, since we need to simultaneously ensure low enough error
rates in both the classical message and the state to be concentrated, as well as sufficient decoupling of the classical
messages from other systems.

Our error-corrected scheme will have k parallel uses of $\mathcal{P}'_n$, but the k inputs are chosen to be a valid codeword of an
error correcting code. Furthermore, for each use of $\mathcal{P}'_n$, the state in $A_{2,3,4}B_{2,3,4}$ will only be collected for entanglement
concentration if the error syndrome is trivial for that use of $\mathcal{P}'_n$. We use the fact that errors occur rarely (at a rate of
$\epsilon_n$, which goes to zero as $n \rightarrow \infty$) to show that (1) most states are still used for concentration, and (2) communicating
the indices of the states with nontrivial error syndrome requires a negligible amount of communication.

• Definition of $\mathcal{P}''_{nk}$: error corrected version of $(\mathcal{P}'_n)^{\otimes k}$ with entanglement concentration

We construct two codes, one used by Alice to signal to Bob and one from Bob to Alice. We consider high distance
codes. The distance of a code is the minimum Hamming distance between any two codewords, i.e. the number of
positions in which they are different.

First consider the code used by Alice. Let $N_1 = 2^{C_1^{(n)}}$. Alice is coding for a channel that takes input symbols from
$[N_1] := \{1, \ldots, N_1\}$ and has probability $\le \epsilon_n$ of error on any input (the error rate depends on both a and b). We would
like to encode $[N_1]^l$ in $[N_1]^k$ using a code with distance $2k\alpha_n$, where $\alpha_n$ is a parameter that will be chosen later. Such
a code can correct up to any $\lfloor k\alpha_n - \frac{1}{2} \rfloor$ errors (without causing much problem, we just say that the code corrects $k\alpha_n$
errors). Using standard arguments [18], we can construct such a code with $l \ge k\,[1 - 2\alpha_n - H_2(2\alpha_n)/C_1^{(n)}]$, where
$H_2(p) = -p\log p - (1-p)\log(1-p)$ is the binary entropy. The code used by Bob is chosen similarly, with $N_2 = 2^{C_2^{(n)}}$
input symbols to each use of $\mathcal{P}'_n$. For simplicity, Alice's and Bob's codes share the same values of $l$, $k$ and $\alpha_n$. We
choose $\alpha_n \ge \max(1/C_1^{(n)}, 1/C_2^{(n)})$ so that $l \ge k(1-3\alpha_n)$.

Furthermore, we want the probability of having $> k\alpha_n$ errors to be vanishingly small. This probability is
$\le \exp(-kD(\alpha_n\|\epsilon_n)) \le \exp(k + k\alpha_n\log\epsilon_n)$ (using arguments from [19]) $\le \exp(-k)$ if $\alpha_n \ge -2/\log\epsilon_n$.
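The two parameter constraints on $\alpha_n$ can be sanity-checked numerically. The sketch below evaluates, per unit of k and in base 2, the exponents in the chain of tail bounds and the two lower bounds on the code rate $l/k$; the function names and the sample values of $\epsilon_n$, $\alpha_n$ and $C_1^{(n)}$ are illustrative assumptions.

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def relative_entropy(alpha, eps):
    """D(alpha || eps) in bits for two Bernoulli distributions."""
    return (alpha * math.log2(alpha / eps)
            + (1 - alpha) * math.log2((1 - alpha) / (1 - eps)))

def tail_exponents(alpha, eps):
    """Exponents e1 <= e2 <= e3 with Pr[> k*alpha errors] <= 2**(k*e1)."""
    return (-relative_entropy(alpha, eps), 1 + alpha * math.log2(eps), -1.0)

def rate_lower_bounds(alpha, message_bits):
    """(1 - 2a - H2(2a)/C, 1 - 3a); the first dominates once a >= 1/C."""
    return (1 - 2 * alpha - h2(2 * alpha) / message_bits, 1 - 3 * alpha)

eps_n = 1e-6
alpha_n = 0.11          # satisfies alpha_n >= -2 / log2(eps_n), about 0.1003
e1, e2, e3 = tail_exponents(alpha_n, eps_n)
gv_rate, simple_rate = rate_lower_bounds(alpha_n, message_bits=20)
```

The first inequality in the chain uses $D(\alpha\|\epsilon) \ge -H_2(\alpha) - \alpha\log\epsilon \ge -1 - \alpha\log\epsilon$, and the second uses only $\alpha_n \log\epsilon_n \le -2$, so the ordering `e1 <= e2 <= e3` holds for any admissible choice, not just the sample values.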

Using these codes, Alice and Bob construct $\mathcal{P}''_{nk}$ as follows (with steps 1-3 performed coherently).
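0. Let $(a^o_1, \cdots, a^o_l)$ be a vector of $l$ messages each of $C_1^{(n)}$ bits, and $(b^o_1, \cdots, b^o_l)$ be $l$ messages each of $C_2^{(n)}$ bits.

1. Using her error correcting code, Alice encodes $(a^o_1, \cdots, a^o_l)$ in a valid codeword $\vec a = (a_1, \cdots, a_k)$ which is a
k-vector. Similarly, Bob generates a valid codeword $\vec b = (b_1, \cdots, b_k)$ using his code.

2. Let $\vec A_1 := A_1^{\otimes k}$ denote a tensor product of k input spaces each of $C_1^{(n)}$ qubits. Similarly, $\vec B_1 := B_1^{\otimes k}$. (We
will also denote k copies of $A_{0,2,3,4}$ and $B_{0,2,3,4}$ by adding the vector symbol.) Alice and Bob apply $(\mathcal{P}'_n)^{\otimes k}$
to $|\vec a\rangle_{\vec A_1}|\vec b\rangle_{\vec B_1}$; that is, in parallel, they apply $\mathcal{P}'_n$ to each pair of inputs $(a_j, b_j)$. The resulting state is a tensor
product of states of the form given by Eq. (10):

$\bigotimes_{j=1}^{k} \Big[\, |a_j\rangle_{A_0}|b_j\rangle_{B_0} \sum_{a'_j,b'_j} |b'_j\rangle_{A_1}|a'_j\rangle_{B_1}\, |\Gamma_{a_j\oplus a'_j,\, b_j\oplus b'_j}\rangle_{A_{2,3,4}B_{2,3,4}} \Big]$ . (12)

Define $|\Gamma_{\vec a\oplus\vec a',\, \vec b\oplus\vec b'}\rangle_{\vec A_{234}\vec B_{234}} := \bigotimes_{j=1}^{k} |\Gamma_{a_j\oplus a'_j,\, b_j\oplus b'_j}\rangle_{A_{2,3,4}B_{2,3,4}}$. Then, Eq. (12) can be written more succinctly as

$|\vec a\rangle_{\vec A_0}|\vec b\rangle_{\vec B_0} \sum_{\vec a',\vec b'} |\vec b'\rangle_{\vec A_1}|\vec a'\rangle_{\vec B_1}\, |\Gamma_{\vec a\oplus\vec a',\, \vec b\oplus\vec b'}\rangle_{\vec A_{234}\vec B_{234}}$ . (13)

3. Alice performs the error correction step on $\vec A_1$ and Bob does the same on $\vec B_1$. According to our code constructions,
this (joint) step fails with probability $p_{\rm fail} \le 2 \cdot 2^{-k}$. (We will see below why $p_{\rm fail}$ is independent of $\vec a$ and $\vec b$.)

In order to describe the residual state, we now introduce $G_A = \{\vec x \in [N_1]^k : |\vec x| \le k\alpha_n\}$ and $G_B = \{\vec x \in [N_2]^k :
|\vec x| \le k\alpha_n\}$, where $|\vec x| := |\{j : x_j \ne 0\}|$ denotes the Hamming weight of $\vec x$. Thus $G_{A,B}$ are sets of correctable (good)
errors, in the sense that there exist local decoding isometries $\mathcal{D}_A, \mathcal{D}_B$ such that for any codeword $\vec a \in [N_1]^k$ we
have $\forall \vec a' \in \vec a \oplus G_A$, $\mathcal{D}_A|\vec a'\rangle = |\vec a\rangle|\vec a \oplus \vec a'\rangle$ (and similarly, if $\vec b \in [N_2]^k$ is a codeword, then $\forall \vec b' \in \vec b \oplus G_B$, $\mathcal{D}_B|\vec b'\rangle =
|\vec b\rangle|\vec b \oplus \vec b'\rangle$). For concreteness, let the decoding maps take $\vec A_1$ to $\vec A_1\vec A_5$ and $\vec B_1$ to $\vec B_1\vec B_5$.
Conditioned on success, Alice and Bob are left with

$\frac{1}{\sqrt{1-p_{\rm fail}}}\, |\vec a, \vec b\rangle_{\vec A_{0,1}} |\vec a, \vec b\rangle_{\vec B_{0,1}} \sum_{\vec a' \in \vec a\oplus G_A} \sum_{\vec b' \in \vec b\oplus G_B} |\vec b \oplus \vec b'\rangle_{\vec A_5} |\vec a \oplus \vec a'\rangle_{\vec B_5}\, |\Gamma_{\vec a\oplus\vec a',\, \vec b\oplus\vec b'}\rangle_{\vec A_{234}\vec B_{234}}$ (14)

$= \frac{1}{\sqrt{1-p_{\rm fail}}}\, |\vec a, \vec b\rangle_{\vec A_{0,1}} |\vec a, \vec b\rangle_{\vec B_{0,1}} \sum_{\vec a'' \in G_A} \sum_{\vec b'' \in G_B} |\vec b''\rangle_{\vec A_5} |\vec a''\rangle_{\vec B_5}\, |\Gamma_{\vec a'',\, \vec b''}\rangle_{\vec A_{234}\vec B_{234}}$ , (15)

where we have defined $\vec a'' := \vec a \oplus \vec a'$ and $\vec b'' := \vec b \oplus \vec b'$. Note that $2^{-k+1} \ge p_{\rm fail} = \sum_{(\vec a'', \vec b'') \notin G_A \times G_B} \langle\Gamma_{\vec a'',\, \vec b''}|\Gamma_{\vec a'',\, \vec b''}\rangle$,
which is manifestly independent of $\vec a, \vec b$. The ancilla is now completely decoupled from the message, resulting in
coherent classical communication. The only remaining issue is recovering entanglement from the ancilla, so for
the remainder of the protocol we ignore the now decoupled states $|\vec a, \vec b\rangle_{\vec A_{0,1}} |\vec a, \vec b\rangle_{\vec B_{0,1}}$.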



4. For any $\vec x$, define $S(\vec x) := \{j : x_j \ne 0\}$ to be the set of positions where $\vec x$ is nonzero. If $\vec x \in G_A$ (or $G_B$), then
$|S(\vec x)| \le k\alpha_n$. Thus, $S(\vec x)$ can be written using $\le \log \sum_{j \le k\alpha_n} \binom{k}{j} \le \log\binom{k}{k\alpha_n} + \log(k\alpha_n) \le kH_2(\alpha_n) + \log(k\alpha_n)$
bits.

The next step is for Alice to compute $|S(\vec b'')\rangle$ from $|\vec b''\rangle$ and communicate it to Bob using $(kH_2(\alpha_n) +
\log(k\alpha_n))$ cbits($\rightarrow$). Similarly, Bob sends $|S(\vec a'')\rangle$ to Alice using $(kH_2(\alpha_n) + \log(k\alpha_n))$ cbits($\leftarrow$). Here we need
to assume the existence of some (possibly inefficient) protocol that sends $O(k)$ bits in either direction with error $\exp(-k-1)$
(chosen for convenience) using $Rk$ uses of U for some constant R. Such a protocol was shown in Ref. [6] and
the bound on the error can be obtained from the HSW theorem [16].

Alice and Bob now have the state

$\frac{1}{\sqrt{1-p_{\rm fail}}} \sum_{\vec a'' \in G_A} \sum_{\vec b'' \in G_B} |S(\vec a'')S(\vec b'')\rangle_{\vec A_6} |\vec b''\rangle_{\vec A_5}\, |S(\vec a'')S(\vec b'')\rangle_{\vec B_6} |\vec a''\rangle_{\vec B_5}\, |\Gamma_{\vec a'',\, \vec b''}\rangle_{\vec A_{234}\vec B_{234}}$ (16)

Conditioning on their knowledge of $S(\vec a''), S(\vec b'')$, Alice and Bob can now identify $k' \ge k(1-2\alpha_n)$ positions
where $a''_j = b''_j = 0$, and extract $k'$ copies of $\frac{1}{\sqrt{1-\bar\epsilon_n}}|\Gamma_{00}\rangle$. Note that leaking $S(\vec a''), S(\vec b'')$ to the environment
will not affect the extraction procedure; therefore, coherent computation and communication of $S(\vec a''), S(\vec b'')$
is unnecessary. (We have not explicitly included the environment's copy of $|S(\vec a'')S(\vec b'')\rangle$ in the equations to
minimize clutter.) After extracting $k'$ copies of $\frac{1}{\sqrt{1-\bar\epsilon_n}}|\Gamma_{00}\rangle$, we can safely discard the remainder of the state,
which is now completely decoupled from both $\big[\frac{1}{\sqrt{1-\bar\epsilon_n}}|\Gamma_{00}\rangle\big]^{\otimes k'}$ and the message $|\vec a\rangle_{\vec A_0}|\vec b\rangle_{\vec A_1}|\vec b\rangle_{\vec B_0}|\vec a\rangle_{\vec B_1}$.
5. Alice and Bob perform entanglement concentration $\mathcal{E}_{\rm conc}$ (using the techniques of Ref. [13]) on $\big[\frac{1}{\sqrt{1-\bar\epsilon_n}}|\Gamma_{00}\rangle\big]^{\otimes k'}$.
Note that since $\frac{1}{\sqrt{1-\bar\epsilon_n}}|\Gamma_{00}\rangle$ can be created using U n times and then using classical communication and
postselection, it must have Schmidt rank $\le \mathrm{Sch}(U)^n$, where $\mathrm{Sch}(U)$ is the Schmidt number of the gate
U [20]. Also recall that $E\big(\frac{1}{\sqrt{1-\bar\epsilon_n}}|\Gamma_{00}\rangle\big) \ge C_1^{(n)} + C_2^{(n)} + \log(1-\epsilon_n)$. According to Ref. [13], $\mathcal{E}_{\rm conc}$
requires no communication and with probability $\ge 1 - \exp[-\mathrm{Sch}(U)^n(\sqrt{k'} - \log(k'+1))]$ produces at least
$k'[C_1^{(n)} + C_2^{(n)} + \log(1-\epsilon_n)] - \mathrm{Sch}(U)^n[\sqrt{k'} - \log(k'+1)]$ ebits.
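The cost of describing the error locations in step 4, and the number of clean positions left for concentration, can be illustrated numerically. This is a sketch with hypothetical helper names and sample parameters; it uses `math.comb`, so it assumes Python 3.8 or later.

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def location_bits_exact(k, max_weight):
    """log2 of the number of subsets of [k] with at most max_weight elements."""
    return math.log2(sum(math.comb(k, j) for j in range(max_weight + 1)))

def location_bits_bound(k, alpha):
    """The bound k*H2(alpha) + log2(k*alpha) quoted in step 4."""
    return k * h2(alpha) + math.log2(k * alpha)

def clean_positions(k, errors_a, errors_b):
    """Positions j where neither message suffered an error (a''_j = b''_j = 0)."""
    return [j for j in range(k) if j not in errors_a and j not in errors_b]

k, alpha = 1000, 0.05
exact = location_bits_exact(k, int(k * alpha))
bound = location_bits_bound(k, alpha)

# Worst case for k': the two error sets are disjoint and both of maximal size.
worst_a = set(range(0, 50))
worst_b = set(range(50, 100))
k_prime = len(clean_positions(k, worst_a, worst_b))
```

Since the description length grows like $kH_2(\alpha_n)$ while each of the k blocks carries $\Theta(n)$ bits, the side information is a vanishing fraction of the total communication once $\alpha_n \rightarrow 0$.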

• Error and resource accounting

$\mathcal{P}''_{nk}$ consumes a total of

(0) $nk$ uses of U (in the k executions of $\mathcal{P}_n$)

(1) $Rk$ uses of U (for communicating nontrivial syndrome locations)

(2) $k[C_1^{(n)} + C_2^{(n)}]$ ebits (for the encryption of classical messages).

$\mathcal{P}''_{nk}$ produces, with probability and fidelity no less than $1 - 2 \cdot 2^{-(k-1)} - \exp[-\mathrm{Sch}(U)^n(\sqrt{k'} - \log(k'+1))]$, at least

(1) $lC_1^{(n)}$ cobits($\rightarrow$) + $lC_2^{(n)}$ cobits($\leftarrow$)

(2) $k'(C_1^{(n)} + C_2^{(n)} + \log(1-\epsilon_n)) - \mathrm{Sch}(U)^n(\sqrt{k'} - \log(k'+1))$ ebits.

We restate the constraints on the above parameters: $\epsilon_n, \delta_n \rightarrow 0$ as $n \rightarrow \infty$; $C_1^{(n)} \ge n(C_1-\delta_n)$, $C_2^{(n)} \ge n(C_2-\delta_n)$;
$\alpha_n \ge \max(1/C_1^{(n)}, 1/C_2^{(n)}, -2/\log\epsilon_n)$; $k' \ge k(1-2\alpha_n)$; $l \ge k(1-3\alpha_n)$.

We define "error" to include both infidelity and the probability of failure. To leading orders of $k, n$, this is equal to
$2^{-(k-2)} + \exp[-\sqrt{k}\,\mathrm{Sch}(U)^n]$. We define "inefficiency" to include extra uses of U, net consumption of entanglement,
and the amount by which the coherent classical communication rates fall short of the classical capacities. To leading
order of $k, n$, these are respectively $Rk$, $2\alpha_n k(C_1^{(n)}+C_2^{(n)}) + \sqrt{k}\,\mathrm{Sch}(U)^n \approx 2\alpha_n kn(C_1+C_2) + \sqrt{k}\,\mathrm{Sch}(U)^n$, and
$nk(C_1+C_2) - l(C_1^{(n)}+C_2^{(n)}) \le nk(3\alpha_n(C_1+C_2) + 2\delta_n)$. We would like the error to vanish, as well as the fractional
inefficiency, defined as the inefficiency divided by $kn$, the number of uses of U. Equivalently, we can define $f(k,n)$
to be the sum of the error and the fractional inefficiency, and require that $f(k,n) \rightarrow 0$ as $nk \rightarrow \infty$. By the above
arguments,

$f(k,n) \le 2^{-(k-2)} + \exp(-\sqrt{k}\,\mathrm{Sch}(U)^n) + 2\alpha_n(C_1+C_2) + \frac{1}{n\sqrt{k}}\mathrm{Sch}(U)^n + \frac{R}{n} + 3\alpha_n(C_1+C_2) + 2\delta_n$ . (17)

Note that for any fixed value of n, $\lim_{k\rightarrow\infty} f(k,n) = 5\alpha_n(C_1+C_2) + 2\delta_n + R/n$. (This requires k to be sufficiently
large and also $k \gg \mathrm{Sch}(U)^{2n}$.) Now, allowing n to grow, we have

$\lim_{n\rightarrow\infty} \lim_{k\rightarrow\infty} f(k,n) = 0$ . (18)

The order of limits in this equation is crucial due to the dependence of k on n.
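Evaluating the right-hand side of Eq. (17) for sample values shows why k must grow much faster than $\mathrm{Sch}(U)^{2n}$. The sketch below is illustrative only: all parameter values and names are assumptions, and `exp` is taken base 2 as elsewhere in the paper.

```python
import math

def two_to_minus(x):
    """2**(-x), returning 0.0 once the value is below double precision."""
    return 0.0 if x > 1074 else 2.0 ** (-x)

def f_bound(k, n, alpha, delta, c_sum, sch_u, r_const):
    """Right-hand side of Eq. (17), with c_sum standing for C_1 + C_2."""
    k = float(k)
    sch_n = float(sch_u) ** n
    error = two_to_minus(k - 2.0) + two_to_minus(math.sqrt(k) * sch_n)
    entanglement_loss = 2.0 * alpha * c_sum + sch_n / (n * math.sqrt(k))
    extra_uses = r_const / n
    rate_loss = 3.0 * alpha * c_sum + 2.0 * delta
    return error + entanglement_loss + extra_uses + rate_loss

def f_limit(n, alpha, delta, c_sum, r_const):
    """The k -> infinity limit 5*alpha*(C_1 + C_2) + 2*delta + R/n."""
    return 5.0 * alpha * c_sum + 2.0 * delta + r_const / n

params = dict(n=4, alpha=0.01, delta=0.001, c_sum=2.0, sch_u=2, r_const=3.0)
limit = f_limit(params["n"], params["alpha"], params["delta"],
                params["c_sum"], params["r_const"])
large_k = f_bound(k=10 ** 12, **params)   # k >> Sch(U)**(2n) = 256
small_k = f_bound(k=16, **params)         # k << Sch(U)**(2n)
```

With $k = 10^{12}$ the bound is within about $4 \times 10^{-6}$ of its limit, while with $k = 16$ the concentration overhead $\mathrm{Sch}(U)^n/(n\sqrt{k})$ alone contributes 1. The remaining terms in the limit shrink only as n grows, which is the reason for the order of limits in Eq. (18).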

The only remaining problem is our catalytic use of $O(nk)$ ebits. In order to construct a protocol that uses only U, we
need to first use U $O(nk)$ times to generate the starting entanglement. Then we repeat $\mathcal{P}''_{nk}$ m times, reusing the same
entanglement. The catalyst results in an additional fractional inefficiency of $c/m$ (for some constant c depending only
on U) and the errors and inefficiencies of $\mathcal{P}''_{nk}$ add up to no more than $mf(k,n)$. Choosing $m = \lfloor 1/\sqrt{f(k,n)} \rfloor$ will
cause all of these errors and inefficiencies to simultaneously vanish. More generally,

$\lim_{m\rightarrow\infty} \lim_{n\rightarrow\infty} \lim_{k\rightarrow\infty} \left[ mf(k,n) + \frac{c}{m} \right] = 0$ . (19)

This proves the resource inequality

$U \ge C_1$ cobits($\rightarrow$) + $C_2$ cobits($\leftarrow$) . (20)
• The E < 0 and E > 0 cases

If $E < 0$ then entanglement is consumed in $\mathcal{P}_n$, so there exists a sequence of integers $E^{(n)} \le n(-E + \delta_n)$ such that

$\mathcal{P}_n(|a\rangle_{A_1}|b\rangle_{B_1}|\Phi\rangle^{\otimes E^{(n)}}) = \sum_{a',b'} |b'\rangle_{A_1}|a'\rangle_{B_1}|\gamma^{ab}_{a'b'}\rangle_{A_2B_2}$ . (21)

In this case, the analysis for $E^{(n)} = 0$ goes through, only with additional entanglement consumed. Almost all equations
are the same, except now the Schmidt rank for $|\Gamma_{00}\rangle$ is upper-bounded by $[\mathrm{Sch}(U)\,2^{-E+\delta_n}]^n$ instead of $\mathrm{Sch}(U)^n$. In
particular, previous arguments still give Eq. (18) from the modified Eq. (17).

If instead $E > 0$, entanglement is created, so for some $E^{(n)} \ge n(E - \delta_n)$ we have

$\mathcal{P}_n(|a\rangle_{A_1}|b\rangle_{B_1}) = \sum_{a',b'} |b'\rangle_{A_1}|a'\rangle_{B_1}|\gamma^{ab}_{a'b'}\rangle_{A_2B_2}$ (22)

for $E(|\gamma^{ab}_{ab}\rangle_{A_2B_2}) \ge E^{(n)}$. Again, the previous construction and analysis go through, with an extra $E^{(n)}$ ebits of
entropy of entanglement in $|\Gamma_{00}\rangle$, and thus an extra fractional inefficiency of $\le 2\alpha_n E$ in Eq. (17). The Schmidt rank of
$|\Gamma_{00}\rangle$ is still upper bounded by $\mathrm{Sch}(U)^n$ in this case. □

So far, we have focused on the $C_1, C_2 \ge 0$ quadrant. The following theorem will relate the achievable regions for
coherent and incoherent classical communication when $C_1 < 0$ or $C_2 < 0$.

Theorem 2. For any bipartite unitary or isometry U and $C_1, C_2 \ge 0$,

$C_2$ cbits($\leftarrow$) + $U \ge C_1$ cbits($\rightarrow$) + $E$ ebits iff (23)
$U \ge C_1$ cbits($\rightarrow$) + $E$ ebits iff (24)
$U \ge C_1$ cobits($\rightarrow$) + $E$ ebits iff (25)
$C_2$ cobits($\leftarrow$) + $U \ge C_1$ cobits($\rightarrow$) + $(E+C_2)$ ebits (26)

and

$C_1$ cbits($\rightarrow$) + $C_2$ cbits($\leftarrow$) + $U \ge E$ ebits iff (27)
$U \ge E$ ebits iff (28)
$C_1$ cobits($\rightarrow$) + $C_2$ cobits($\leftarrow$) + $U \ge (E+C_1+C_2)$ ebits (29)



In essence, the rates of unidirectional classical communication with arbitrary amount of entanglement assistance (or 
generation) are not increased by (in)coherent classical communication in the opposite direction, except for a trivial 
gain of entanglement when the assisting classical communication is coherent. 

Proof: Using superdense coding to send 2 cobits and supplying the required 1 qubit of quantum communication by 
teleportation (using 2 cbits +1 ebit), we have 

$$1\ \text{cbit} + 1\ \text{ebit} \ge 1\ \text{cobit}. \tag{30}$$

The above resource transformation is exact and does not require large blocks. Thus, composing it with other protocols 
poses no extra problem. 
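
The resource counting behind Eq. (30) can be checked with a small bookkeeping sketch. It only adds up the resources consumed and produced by the two protocols named in the proof (teleportation, and superdense coding producing cobits); it does not simulate or verify the protocols themselves. The dictionary representation and the helper names `compose` and `scale` are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction

# Net resource change of each protocol: negative = consumed, positive = produced.
# Both entries restate the protocols named in the proof above.
TELEPORTATION = {"cbit": -2, "ebit": -1, "qubit": +1}
SUPERDENSE_CODING = {"qubit": -1, "ebit": -1, "cobit": +2}


def compose(*protocols):
    """Add net resource changes; intermediate resources that cancel are dropped."""
    total = {}
    for protocol in protocols:
        for resource, amount in protocol.items():
            total[resource] = total.get(resource, 0) + amount
    return {r: a for r, a in total.items() if a != 0}


def scale(protocol, factor):
    """Rescale a net resource change to a per-unit rate."""
    return {r: Fraction(a) * factor for r, a in protocol.items()}


# The qubit produced by teleportation is consumed by superdense coding.
net = compose(TELEPORTATION, SUPERDENSE_CODING)
per_cobit = scale(net, Fraction(1, 2))
print(net)
print(per_cobit)
```

Under this accounting, the composed protocol consumes 2 cbits and 2 ebits to produce 2 cobits, which is Eq. (30) after dividing by two.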

For the first part of the theorem, Eq. (23) $\Rightarrow$ Eq. (24) follows from how Ref. [8] characterizes the set of $(C_1, E)$ that
satisfies Eq. (23). Although the proof in Ref. [8] did not mention back communication, it can be easily modified to show
that free classical communication from Bob to Alice does not change the capacity. In essence, the optimal tradeoff
curve between $C_1$ and $E$ has an upper bound that remains valid in the presence of back classical communication, and
the same bound is achieved by a protocol that uses no back classical communication. A complete proof of this fact
will also appear in Ref. [22].

Ref. [8] also proved that Eq. (24) $\Rightarrow$ Eq. (25), and it is trivial that Eq. (25) $\Rightarrow$ Eq. (26). Finally, Eq. (26) $\Rightarrow$ Eq. (23)
because of Eq. (30).

For the second part of the theorem, Ref. [6] proved that Eq. (27) $\Rightarrow$ Eq. (28). It is trivial that Eq. (28) $\Rightarrow$ Eq. (29).
Using Eq. (30), Eq. (29) $\Rightarrow$ Eq. (27).

IV. ACHIEVABLE REGIONS FOR BIDIRECTIONAL COMMUNICATION

Bipartite unitary gates can be used for several inequivalent purposes simultaneously, including some (possibly different) 
forms of forward and backward communications and entanglement generation. It is thus natural to define their 
capacities in terms of achievable rate regions (in 3-dimensional space) and trade-off surfaces. 

For example, let CCE be the achievable rate region $\{(C_1, C_2, E) : U \ge C_1\ \text{cbits}(\rightarrow) + C_2\ \text{cbits}(\leftarrow) + E\ \text{ebits}\}$, and
OOE be the achievable rate region $\{(C_1, C_2, E) : U \ge C_1\ \text{cobits}(\rightarrow) + C_2\ \text{cobits}(\leftarrow) + E\ \text{ebits}\}$. Theorems 1 and 2
provide a mapping between CCE and OOE:

$$(C_1, C_2, E) \in \mathrm{CCE} \iff (C_1, C_2, E - \min(C_1, 0) - \min(C_2, 0)) \in \mathrm{OOE}. \tag{31}$$
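
As an illustration of the map in Eq. (31), the following minimal sketch converts a rate triple between the two regions. It only transforms coordinates; whether a given triple is actually achievable for a particular gate is not something the code can decide. The function names `cce_to_ooe` and `ooe_to_cce` are hypothetical.

```python
def cce_to_ooe(c1, c2, e):
    """Image in OOE of a CCE rate triple under Eq. (31).

    A negative rate means that type of communication is consumed; when the
    consumed communication is coherent, each cobit also yields one ebit.
    """
    return (c1, c2, e - min(c1, 0) - min(c2, 0))


def ooe_to_cce(c1, c2, e):
    """Inverse of cce_to_ooe: remove the entanglement gained from consumed cobits."""
    return (c1, c2, e + min(c1, 0) + min(c2, 0))


# Forward communication at rate 1, assisted by 2 cbits of back communication.
print(cce_to_ooe(1, -2, 0.5))
# In the quadrant where both rates are nonnegative the map is the identity.
print(cce_to_ooe(1, 2, 0.5))
```

In the first example the backward rate is negative, so the entanglement coordinate grows by 2; in the second the point is unchanged, matching Theorem 1.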

Finding relations between different capacity regions will simplify our study of capacities of bipartite unitary gates and 
elicit their nonlocal properties. 

As a second example of a relation between achievable regions, consider remote state preparation, which is the ability to
prepare a quantum state $|\psi\rangle$ in the laboratory of the receiver, assuming that the sender has a classical description of $|\psi\rangle$
(assuming pure states for simplicity). We claim that the achievable region RRE for two-way (but independent forward
and backward) remote state preparation is the same as CCE. To prove this, first note that $\infty$ cbits $\ge$ $n$ remote qubits $\ge$
$n$ cbits, where $n$ remote qubits denotes the ability to remotely prepare an $n$-qubit state. Combining this with the fact
that even unlimited back-communication does not improve classical capacity implies that RRE $\subseteq$ CCE. On the other
hand, Ref. [8] showed that $n$ coherent bits $\ge$ $n$ remote qubits. Thus the first quadrants ($C_1, C_2 \ge 0$) of RRE and
OOE (and thus CCE) are the same, and the other quadrants of RRE are related to OOE the same way that CCE
is: backwards cobits can be used to generate entanglement, but free backwards remote qubits do not improve the
forward capacity. This means that RRE = CCE.

Similarly, define QQE to be the region $\{(Q_1, Q_2, E) : U \ge Q_1\ \text{qubits}(\rightarrow) + Q_2\ \text{qubits}(\leftarrow) + E\ \text{ebits}\}$, corresponding
to two-way quantum communication. We can also consider coherent classical communication in one direction and
quantum communication in the other; let QOE be the region $\{(Q_1, C_2, E) : U \ge Q_1\ \text{qubits}(\rightarrow) + C_2\ \text{cobits}(\leftarrow) + E\ \text{ebits}\}$
and define OQE similarly.

Ref. [8] related the one-way tradeoff curves OE and QE, defined as $\mathrm{OE} = \{(C, E) : (C, 0, E) \in \mathrm{OOE}\}$ and
$\mathrm{QE} = \{(Q, E) : (Q, 0, E) \in \mathrm{QQE}\}$. There it was claimed that

$$(Q, E) \in \mathrm{QE} \iff (2Q, E - Q) \in \mathrm{OE}. \tag{32}$$

We now rephrase the proof of Eq. (32) in a form that readily extends to a relation between entire achievable rate regions
(for different types of bidirectional communication). Eq. (32) is due to the equivalence 2 cobits = 1 qubit + 1 ebit. Note
that this equivalence involves resource transformations that are exact and do not require large blocks. Thus, composing
these transformations with other protocols poses no extra problem, and the equivalence can be used "freely." To prove
Eq. (32), choose any $(Q, E) \in \mathrm{QE}$. Then $U \ge Q$ qubits $+ E$ ebits $= 2Q$ cobits $+ (E - Q)$ ebits, so $(2Q, E - Q) \in \mathrm{OE}$.
Conversely, if $(2Q, E - Q) \in \mathrm{OE}$, then $U \ge 2Q$ cobits $+ (E - Q)$ ebits $= Q$ qubits $+ E$ ebits, so $(Q, E) \in \mathrm{QE}$.

Note that the above argument still works if we replace $U$ with a different resource, such as $U - Q_2$ qubits($\leftarrow$).
Therefore, the same argument that proved Eq. (32) also establishes the following equivalences for bidirectional rate
regions:

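$$
\begin{array}{ccc}
(Q_1, Q_2, E) \in \mathrm{QQE} & \iff & (2Q_1, Q_2, E - Q_1) \in \mathrm{OQE} \\
\Updownarrow & & \Updownarrow \\
(Q_1, 2Q_2, E - Q_2) \in \mathrm{QOE} & \iff & (2Q_1, 2Q_2, E - Q_1 - Q_2) \in \mathrm{OOE}
\end{array}
\tag{33}
$$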
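
The coordinate changes in Eq. (33) can be sketched as two maps, one per direction, each applying 1 qubit = 2 cobits - 1 ebit to a single coordinate. The sketch below only checks that the two routes around the square agree; it says nothing about which points are achievable. The function names are hypothetical.

```python
from fractions import Fraction


def forward_qubits_to_cobits(point):
    """Trade forward qubits for forward cobits: Q1 qubits = 2*Q1 cobits - Q1 ebits."""
    q1, back, e = point
    return (2 * q1, back, e - q1)


def backward_qubits_to_cobits(point):
    """Trade backward qubits for backward cobits: Q2 qubits = 2*Q2 cobits - Q2 ebits."""
    fwd, q2, e = point
    return (fwd, 2 * q2, e - q2)


qqe_point = (Fraction(1, 2), Fraction(1, 4), Fraction(1))

# Route 1: QQE -> OQE -> OOE.  Route 2: QQE -> QOE -> OOE.
via_oqe = backward_qubits_to_cobits(forward_qubits_to_cobits(qqe_point))
via_qoe = forward_qubits_to_cobits(backward_qubits_to_cobits(qqe_point))
print(via_oqe, via_qoe)
```

Both routes give $(2Q_1, 2Q_2, E - Q_1 - Q_2)$, which is why the square in Eq. (33) is consistent.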

Finally, Eq. (31) further relates QQE, QCE, CQE, CCE, where QCE and CQE are defined similarly to QOE and 
OQE but with incoherent classical communication instead. 

Thus once one of the capacity regions (say OOE) is determined, all other capacity regions discussed above are 
determined. 



APPENDIX A: WHY WE CANNOT USE THE TECHNIQUES IN REF. [8] 

In this appendix, we review the proof of Prop. 1 in Ref. [8] (the unidirectional communication analogue of Theorem 
1) and show how it breaks down when applied to two-way communication. 

We first review HSW coding [16], since the proof of Prop. 1 in [8] is based on it. Given a channel which maps a classical
input $i$ to a quantum state $\rho_i$, the HSW theorem states that its classical capacity is $C := \max_p \left[ S\left(\sum_i p_i \rho_i\right) - \sum_i p_i S(\rho_i) \right]$,
where the maximization is over probability distributions $p$ and $S(\rho) := -\mathrm{Tr}\,\rho \log \rho$ is the von Neumann entropy. The
HSW theorem can be proved by random coding followed by expurgation. That is, we choose $2^{n(C - \delta_n)}$ length-$n$
codewords according to the product distribution $p^n(i_1, \ldots, i_n) = p(i_1) \cdots p(i_n)$ (with $\delta_n \to 0$ as $n \to \infty$). Then with
high probability the codewords will on average be almost perfectly distinguishable from one another. We then discard
(or "expurgate") the worst half of the codewords in order to signal with asymptotically vanishing maximum error at
a rate approaching $C$.
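
To make the quantity inside the maximization concrete, here is a minimal sketch that evaluates $S(\sum_i p_i \rho_i) - \sum_i p_i S(\rho_i)$ for a fixed distribution $p$ over single-qubit states. It assumes 2x2 density matrices given as nested lists, uses base-2 logarithms, and does not perform the maximization over $p$. All function and variable names are illustrative.

```python
import math


def entropy_2x2(rho):
    """Von Neumann entropy in bits of a 2x2 density matrix [[a, b], [conj(b), d]]."""
    (a, b), (_, d) = rho
    trace = (a + d).real
    det = (a * d).real - abs(b) ** 2
    # Eigenvalues of a 2x2 Hermitian matrix from its trace and determinant.
    gap = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    eigenvalues = [(trace + gap) / 2.0, (trace - gap) / 2.0]
    return -sum(x * math.log2(x) for x in eigenvalues if x > 1e-12)


def average_state(probs, states):
    """The mixture sum_i p_i rho_i."""
    return [[sum(p * rho[r][c] for p, rho in zip(probs, states)) for c in range(2)]
            for r in range(2)]


def holevo_quantity(probs, states):
    """S(sum_i p_i rho_i) - sum_i p_i S(rho_i) for a fixed distribution."""
    mixed = entropy_2x2(average_state(probs, states))
    return mixed - sum(p * entropy_2x2(rho) for p, rho in zip(probs, states))


ZERO = [[1.0, 0.0], [0.0, 0.0]]   # |0><0|
ONE = [[0.0, 0.0], [0.0, 1.0]]    # |1><1|
PLUS = [[0.5, 0.5], [0.5, 0.5]]   # |+><+|

print(holevo_quantity([0.5, 0.5], [ZERO, ONE]))
print(holevo_quantity([0.5, 0.5], [ZERO, PLUS]))
```

Orthogonal signal states give one full bit, while the non-orthogonal pair gives strictly less, as expected for states that cannot be perfectly distinguished.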

Instead of choosing codewords according to $p^n$, we could instead randomly choose typical sequences (meaning that
the frequency of a letter $i$ is $n p_i \pm O(\sqrt{n})$). In fact, since there are only poly($n$) different type classes, we can choose
all our codewords to be the same type and still achieve capacity $C$ asymptotically. (The "type" of a string denotes
the number of times each letter appears in the string.)
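
The following sketch illustrates the two facts used here: the type of a string is just its letter counts, and the number of distinct types grows only polynomially in the length. The count uses the standard stars-and-bars formula; the function names are illustrative.

```python
from collections import Counter
from math import comb


def type_of(word):
    """The type of a string: how many times each letter appears."""
    return tuple(sorted(Counter(word).items()))


def num_types(length, alphabet_size):
    """Number of distinct types of strings of a given length (stars and bars)."""
    return comb(length + alphabet_size - 1, alphabet_size - 1)


# Two different strings with the same type.
print(type_of("0011") == type_of("0101"))

# Types grow polynomially while the number of strings grows exponentially.
for n in (4, 16, 64):
    print(n, num_types(n, 2), 2 ** n)
```

Because there are at most $(n+1)^d$ types over an alphabet of size $d$, at least one type class contains a $1/\mathrm{poly}(n)$ fraction of any code, which costs nothing in rate asymptotically.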

Now we review the application of the HSW theorem to coherent communication in Prop. 1 of [8]. Given a gate $U$
such that $U \ge C$ cbits($\rightarrow$), we know (similar to Eq. (5)) that there exists a sequence of unitary protocols $\mathcal{P}_n$, each of which
can communicate a bit string of length $\approx n(C - \delta_n)$ bits up to an error of $\epsilon_n$ for $\delta_n \to 0$, $\epsilon_n \to 0$. $\mathcal{P}_n$ can be viewed as
a channel with HSW capacity $\approx nC$, i.e., by HSW coding, $\mathcal{P}_n$ can be used $k$ times, sending $\approx nkC$ bits with overall
error rate vanishing as $k \to \infty$. (This idea was used in [17] to bound the size of the ancilla systems used in unitary
gate communication.)

Let $p$ be the distribution that almost achieves the HSW capacity. Let $a = (a_1, \ldots, a_k)$ be any HSW codeword. Running
$\mathcal{P}_n$ $k$ times produces the state $|\varphi\rangle = \bigotimes_{i=1}^{k} \mathcal{P}_n|a_i\rangle$. Alice could have copied the input before the protocol, and by
the construction of the HSW code, Bob can extract $a$ with negligible error and disturbance to $|\varphi\rangle$, and Alice and Bob
will have possession of a state which is $k\epsilon_n$ close to $|a\rangle_{A_1}|a\rangle_{B_1} \bigotimes_{i=1}^{k} (\mathcal{P}_n|a_i\rangle)_{A_2B_2}$. The state $|a\rangle$ in $A_1$ and $B_1$ will
allow Alice and Bob to coherently reorder the $k$ copies of $\mathcal{P}_n|a_i\rangle$ (with preagreed total order of the set of all $nC$-bit
words). The reordered state has no information on $a$ except for the letter frequency. Thus, when all $a = (a_1, \ldots, a_k)$
are of the same type, the reordered state becomes independent of $a$ and can be discarded without breaking coherence
of the communication of $|a\rangle$. Or when all $a$ are typical sequences, the small information on $a$ can be removed with
$O(\sqrt{k})$ qubits of communication. Here, $k$ and $n$ are independent, so that indeed $k\epsilon_n \to 0$.
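
A classical analogue of the reordering step may help: if block $i$ depends only on the letter $a_i$, then sorting the blocks by a fixed total order on letters leaves a result that depends only on the type of $a$. The sketch below uses strings as stand-ins for the residual states; it is only an analogy under that assumption and does not model the coherent (controlled) permutation itself. The names `residual_blocks` and `reorder` are hypothetical.

```python
def residual_blocks(codeword):
    """Stand-in for the k leftover states: block i depends only on letter a_i."""
    return [f"residual({letter})" for letter in codeword]


def reorder(codeword, blocks):
    """Permute the blocks so that their letters appear in a fixed total order."""
    order = sorted(range(len(codeword)), key=lambda i: codeword[i])
    return [blocks[i] for i in order]


# Two codewords of the same type, and one of a different type.
a1 = ("01", "10", "01", "11")
a2 = ("11", "01", "10", "01")
a3 = ("01", "01", "01", "11")

sorted1 = reorder(a1, residual_blocks(a1))
sorted2 = reorder(a2, residual_blocks(a2))
sorted3 = reorder(a3, residual_blocks(a3))
print(sorted1 == sorted2, sorted1 == sorted3)
```

Codewords of the same type produce identical reordered blocks, so nothing about which codeword was sent remains in them; a codeword of a different type leaves a distinguishable residue.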

(The original form of the HSW theorem in which we simply choose random codewords according to $p^n$ and expurgate
causes a problem in this application. With high probability, the codewords are typical, but some codewords can be
highly nontypical, with corresponding ancilla that cannot be made identical to a "typical ancilla" using negligible
resources.)

The same-type HSW coding technique cannot be easily applied in the two-way case. Even if Alice only uses HSW
codewords $|a\rangle$ of the same type and similarly for codewords $|b\rangle$ of Bob, the joint string $(a, b) := ((a_1, b_1), \ldots, (a_k, b_k))$
need not have the same type. With high probability $(a, b)$ will be typical, but some are far from typical. Worse still,
these are composite codewords that depend jointly on $a$ and $b$ and cannot be expurgated by independent expurgation
of individual codewords used by Alice and Bob.
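
A small example shows how joint types can differ even when each party's codewords all share one type. The specific strings are arbitrary illustrations, not codewords from any actual HSW code.

```python
from collections import Counter


def type_of(sequence):
    """Letter counts of a sequence (letters may be symbols or pairs of symbols)."""
    return Counter(sequence)


# Alice's two codewords have the same type; Bob uses a single codeword.
a1, a2 = "0011", "1100"
b = "0011"

joint1 = type_of(zip(a1, b))
joint2 = type_of(zip(a2, b))

print(type_of(a1) == type_of(a2))
print(joint1)
print(joint2)
```

Here `a1` and `a2` have identical types, yet the joint string with `b` consists only of matching pairs in one case and only of mismatched pairs in the other, so restricting each party to a single type class does not fix the joint type.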

Thus we obtain the strange situation where the average error is small, but we cannot make the maximum error 
small because expurgation requires a linear amount of communication. A similar problem was found in bidirectional 
classical channels, where the achievable capacity regions are different depending on whether average or maximum 
error is considered [23]. Classically, this separation between achievable average and maximum error occurs only when
we restrict to deterministic encodings; Ref. [15] points out that the capacity regions for maximum and average error 
are the same when we let randomness be introduced into the encodings. The main result of our paper can thus be 
thought of as a coherent version of Ref. [15]. 



APPENDIX B: IMPLICATIONS ON THE DEFINITION OF COHERENT CLASSICAL COMMUNICATION

There are two ways to define a cbit. One is in terms of an abstract operation $|x\rangle_A \to |x\rangle_B|x\rangle_E$ for $x \in \{0, 1\}$. Another
is more operational, that some sequence of operations $\mathcal{P}_n$ can send $n$ cbits with error $\epsilon_n \to 0$ if $\mathcal{P}_n(|x\rangle_A) \approx |x\rangle_B$,
for $x$ an $n$-bit string. The fact that the operational and abstract definitions are equivalent allows us to think about
classical communication in both ways interchangeably.

Similarly we can define a cobit either as an abstract operation $|x\rangle_A \to |x\rangle_A|x\rangle_B$ for $x \in \{0, 1\}$, or by saying that $\mathcal{P}_n$
can send $n$ cobits with error $\epsilon_n \to 0$ if $\mathcal{P}_n$ can send $n$ cbits with error $\epsilon_n$ and $\mathcal{P}_n$ is an isometry. By Prop. 1 of [8],
these definitions are equivalent for one-way communication. Theorem 1 of this paper shows that these definitions are also
equivalent for two-way communication. This justifies the name "coherent classical communication"; a cobit really is
no more and no less than a cbit sent through coherent means (i.e., a unitary gate or isometry).